ABSTRACT

A single neuron is categorized as “multisensory” if there is a statistically significant difference between the response evoked
by an audio-visual stimulus combination and that evoked by the most effective of its components individually. Crossmodal 
enhancement is commonly expressed as a proportion of the strongest unisensory response. However, being responsive to 
multiple sensory modalities does not guarantee that a neuron has actually engaged in integrating its multiple sensory inputs, 
rather than simply responding to the most salient stimulus. Here, we propose an alternative index measuring by how much the 
crossmodal response surpasses the level obtainable by optimally combining the unisensory responses. Optimality is defined 
by probability summation combining the unisensory responses under maximal negative stochastic dependence. The new index 
is analogous to measuring crossmodal enhancement by the amount of violation of the “race model inequality”, which is widely 
used in reaction time studies of multisensory integration. Neurons previously labeled as “multisensory” may lose that property 
since the new index tends to be smaller than the traditional one. This is exemplified with a data set collected from single SC 
neurons. The new easy-to-compute index does not require any specific distributional assumption. It is sensitive to the variability 
in the data, in contrast to the traditional index which, by definition, only depends on the means of the uni- and crossmodal 
response distributions. 


Introduction 

Single neurons in the deep layers of the mammalian superior colliculus (SC) integrate afferent visual, auditory, and somatosensory
cues and generate efferent motor commands to structures innervating the musculature of, e.g., the eyes and hands. 1,2
Multisensory integration is defined operationally as the neural process by which unisensory signals are combined to produce a 
multisensory response that is significantly different from the responses evoked by the modality-specific component stimuli. 3 
For example, at the level of a single superior colliculus (SC) neuron, response strength has traditionally been measured as 
the absolute number of impulses (spikes) registered within a fixed time interval after stimulus presentation or, sometimes, 
by the firing rate within this interval. A neuron is categorized as being “multisensory” if the absolute number of spikes to a 
cross-modal, e.g. visual-acoustic, stimulus combination is significantly higher (or, in case of inhibition, lower) than the number 
of spikes evoked by the most effective of its components individually. 2 Moreover, if a neuron responds, for example, to visual 
but not to auditory stimulation and if the response to a visual-auditory combination differs significantly from the response to the 
visual stimulus, it is also considered “multisensory”.

To date, the most widely used descriptive measure of the magnitude of multisensory integration is the crossmodal
enhancement index (CRE), also termed crossmodal interaction index. It is defined as 


\[
\mathrm{CRE} = \frac{CM - SM_{\max}}{SM_{\max}} \times 100, \tag{1}
\]


where $CM$ is the mean number of spikes in response to the crossmodal stimulus and $SM_{\max}$ is the mean number of spikes to
the most effective modality-specific component stimulus. 4 Thus, CRE expresses crossmodal enhancement as a proportion of 
the strongest unisensory response. Some modifications of CRE have been proposed as well. '’ Prominently, in the “additive 
model”, the term $SM_{\max}$ in Equation (1) is replaced by the sum of the unisensory responses. 6 The additive version has raised some
controversy because, under some modeling assumptions, an additive combination of cross-modal inputs yields a prediction of 
optimal multisensory integration. 7 Thus, observing that a neural circuit is actually engaged in optimal multisensory enhancement 
but does not achieve “superadditivity”, would lead one to conclude that no multisensory integration has taken place. Similarly, 
any crossmodal response larger than the largest unisensory response but smaller than the sum might be misinterpreted as
response depression. 8 In summary, the issue of exactly how to measure the amount of multisensory integration has been under
debate for some time. 5,8,9

The purpose of this paper is to suggest an alternative to existing measures of crossmodal response enhancement. While 
having descriptive value, the main weakness of CRE and related measures is that they lack a commonly accepted theoretical 
basis. Such a basis is essential since merely being responsive to multiple sensory modalities does not guarantee that a neuron 
has actually engaged in integrating its multiple sensory inputs, rather than simply responding to the most salient stimulus 
modality. As Stein and colleagues 8 (ibid, p. 114) have put it, “At the time of the early physiology studies in the 1980s, it 
was considered possible that these neurons only represented a common route by which independent inputs from a variety of 
senses could gain access to the same motor apparatus in generating behavior (e.g., possibly employing a “winner-take-all” 
algorithm).” 

Given that the actual computations performed by a multisensory neuron are still not fully understood," 1 developing a new 
measure should not depend on specific assumptions about the multisensory integration process. The suggestion offered here is 
a measure that compares the mean observed cross-modal response of a neuron with the largest mean achievable by optimally 
combining its unisensory responses, but without actually integrating them. Specifically, it measures by how much a neuron 
integrates information above the level obtainable by an optimal “winner-take-all” algorithm, as mentioned in the above quote. 
Note that the new measure does not presuppose that a neuron follows a specific operational mode. Rather, it takes the result of
a potential probability summation mechanism as a benchmark to define the maximal enhancement that can be predicted by 
separately combining unisensory information streams. Because this measure generally is more restrictive than the traditional 
CRE, many neurons previously categorized as “multisensory” risk losing that property. 

In order to motivate the new definition, we first consider an established measure of crossmodal enhancement in behavioral 
data, the race model inequality for reaction times. A numerical measure derived from that inequality turns out to be completely 
analogous to the measure proposed here for neural data. After introducing the new index, its properties are illustrated on a 
sample of spike data (Mark Wallace, personal communication, July 18, 2015) and compared to the traditional index. Finally,
the special case of Poisson distributed spikes serves to demonstrate that, in contrast to the traditional index, the new one takes 
the variability of the data into account as well. 


Measuring crossmodal enhancement of reaction time 

In the redundant signals paradigm, stimuli from two (or more) different modalities are presented more or less simultaneously, 
and participants are instructed to respond to a stimulus of any modality, whichever is detected first. Besides comparing relative 
detection frequencies of unimodal vs. crossmodal stimuli, behavioral response strength is most often measured by reaction time 
(RT), that is, the time it takes a participant to respond (e.g., via button press) to a suddenly appearing stimulus, often visual 
or acoustic. Typically, time to respond in the cross-modal condition is shorter than that in either of the unimodal conditions. 
A significant reduction of mean RT to the cross-modal stimulus, compared with the faster of the unimodal mean RTs, has 
been taken as evidence for some true multisensory processing (“coactivation”) underlying the cross-modal reaction times. 11 In 
analogy to CRE at the neural level, the index of crossmodal response enhancement for reaction time ($\mathrm{CRE}_{RT}$) is defined as 12-15

\[
\mathrm{CRE}_{RT} = \frac{RT_{\min} - RT_{CM}}{RT_{\min}} \times 100, \tag{2}
\]

where $RT_{CM}$ is the mean RT to the cross-modal stimulus and $RT_{\min}$ is the faster of the unimodal mean RTs. Thus, $\mathrm{CRE}_{RT}$
expresses multisensory enhancement as a proportional reduction of the faster unisensory response by the cross-modal response.
For concreteness, we rewrite $\mathrm{CRE}_{RT}$ for the case of visual-auditory stimulation, with $\mathrm{E}RT_V$, $\mathrm{E}RT_A$, and $\mathrm{E}RT_{VA}$ denoting expected
(mean) reaction time to the visual stimulus, the auditory stimulus, or the visual-auditory stimulus combination, respectively.
$\mathrm{CRE}_{RT}$ then becomes

\[
\mathrm{CRE}_{RT} = \frac{\min\{\mathrm{E}RT_V, \mathrm{E}RT_A\} - \mathrm{E}RT_{VA}}{\min\{\mathrm{E}RT_V, \mathrm{E}RT_A\}} \times 100. \tag{3}
\]


Just like the neural measure CRE of Equation (1), the index $\mathrm{CRE}_{RT}$ has descriptive value. For example, $\mathrm{CRE}_{RT} = 10$ means that the mean response
to the visual-auditory stimulus is 10% faster than the faster of the mean responses to unimodal visual and auditory stimuli.
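
As a minimal sketch (not part of the original analysis), the two descriptive indices of Equations (1) and (3) can be computed from trial-wise samples as follows. The function names `cre_spikes` and `cre_rt` and the reaction times used in the example are illustrative assumptions.

```python
from statistics import mean


def cre_spikes(cm, unisensory):
    """Traditional CRE (Eq. 1): percent gain of the mean crossmodal spike
    count over the largest mean unisensory spike count."""
    sm_max = max(mean(sample) for sample in unisensory)
    return (mean(cm) - sm_max) / sm_max * 100.0


def cre_rt(rt_cm, rt_unimodal):
    """CRE_RT (Eq. 3): percent reduction of the faster mean unimodal RT
    achieved by the mean crossmodal RT."""
    rt_min = min(mean(sample) for sample in rt_unimodal)
    return (rt_min - mean(rt_cm)) / rt_min * 100.0


# Hypothetical reaction times in ms (illustrative values only).
rt_v = [210, 190, 200]
rt_a = [230, 250, 240]
rt_va = [175, 185, 180]
print(cre_rt(rt_va, [rt_v, rt_a]))
```

Both functions depend only on sample means, which is the property the remainder of the paper contrasts with the proposed index.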


The race model 

Interestingly, it has been recognized early on 1(1 that simply comparing crossmodal and unimodal mean RTs is not diagnostic 
with respect to a presumed underlying multisensory integration process, for the following reason. Let us assume that in the 
crossmodal condition, (i) each individual stimulus elicits a process performed in parallel to the others and, (ii), the finishing 
time of the faster process determines the observed RT. In this so-called race model, no actual integration of the unimodal 
processes takes place but, nevertheless, mean RT in the crossmodal condition is predicted to be shorter than the faster of the
unimodal mean RTs, due to “statistical facilitation” (aka “probability summation” or “winner-take-all” mechanism). In order to 
gauge whether observed crossmodal RTs are faster than predicted by statistical facilitation, Jeff Miller 1 '■ 17 proposed the race 
model inequality (RMI) test, 

\[
P(\min\{V,A\} \le t) \le P(V \le t) + P(A \le t) \quad \text{or} \quad F_{VA}(t) \le F_V(t) + F_A(t) \quad \text{for all } t \ge 0. \tag{4}
\]

Here $V$ and $A$ denote visual and auditory processing times, respectively, with $F_V$, $F_A$ the corresponding unimodal RT distributions,
and $F_{VA}$ the distribution of the RTs in the crossmodal (visual-auditory) condition, ignoring possible other stages, like response
preparation, of observable RT. Violation of Equation (4) at any time point t is evidence in favor of some form of multisensory 
integration taking place above statistical facilitation, often termed “coactivation”. Note that stochastic independence between 
the processing times V and A is not required, but the test is valid only if an assumption of “context independence” holds: the 
distributions of V and A in the unimodal conditions must equal their corresponding marginal distributions in the crossmodal 
condition 1 19 (see next subsection). 
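
The test of Inequality (4) can be sketched with empirical distribution functions. The snippet below is an illustration under stated assumptions, not the procedure of any particular study: the function names are hypothetical, the inequality is checked only at the observed time points, and no statistical test for the significance of a violation is included.

```python
from fractions import Fraction


def ecdf(sample, t):
    """Empirical distribution function: exact fraction of observations <= t."""
    return Fraction(sum(x <= t for x in sample), len(sample))


def rmi_violations(rt_v, rt_a, rt_va):
    """Return the observed time points t at which the empirical version of
    Inequality (4) is violated, i.e. F_VA(t) > min{F_V(t) + F_A(t), 1}."""
    grid = sorted(set(rt_v) | set(rt_a) | set(rt_va))
    return [t for t in grid
            if ecdf(rt_va, t) > min(ecdf(rt_v, t) + ecdf(rt_a, t), 1)]


# Hypothetical reaction times in ms (illustrative values only).
print(rmi_violations([200, 220, 240], [210, 230, 250], [150, 160, 170]))
```

Exact fractions are used so that ties between the two sides of the inequality are not misclassified through floating-point rounding.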

The race model inequality has become the standard tool for testing whether observed reaction times to crossmodal stimuli 
are faster than predicted by a simple statistical facilitation mechanism. Gondan and Minakata 20 report 83 studies from 2011 
to 2014 performing the inequality test using a variety of statistical methods. Because, unlike CRE, Inequality (4) does not 
represent a single numerical measure of the amount of crossmodal enhancement, it has become practice to compute the 
following geometric measure: the area $S$ between $F_{VA}$ and $F_V + F_A$ defined by all $t$ values where the race model inequality is
violated:

\[
S = \int_0^{\infty} 1_C(t)\,dt \quad \text{with} \quad C = \{t : F_{VA}(t) > \min\{F_V(t) + F_A(t), 1\}\}. \tag{5}
\]

The sample estimate of area $S$ is then taken as index of the strength of violation of the inequality. Notably, a brief discussion of
the race model inequality in the next section reveals that area $S$ can be interpreted as the expected value of random variable
$\min\{V,A\}$ (under maximal negative dependence) and that estimating $S$ is rather straightforward, not requiring any geometric
argument (for details, see also 21 ).

Context independence and coupling of random variables 

Sometimes, instead of Equation (4), a more restrictive inequality is tested, 

\[
F_{VA}(t) \le F_V(t) + F_A(t) - F_V(t) \cdot F_A(t), \tag{6}
\]

assuming stochastic independence between $V$ and $A$. This raises the general question of how the random variables in the
unimodal conditions, $V$ and $A$, are related. Actually, as already observed by R.D. Luce [18, p. 130], there exists a priori no
stochastic relation between them: the probability measures for $V$ and $A$, $P_V$ and $P_A$, are defined on different probability spaces,
thus $V$ and $A$ are stochastically unrelated: there is no empirical context (e.g., trial number) in which a unimodal event $\{V \le s\}$
co-occurs with a unimodal event $\{A \le t\}$ to define a joint distribution for $(V,A)$. Nevertheless, such a joint distribution can
always be constructed by the stochastic concept of coupling. A coupling of random variables $V$ and $A$ is a pair of random
variables $(\hat{V}, \hat{A})$ with a bivariate distribution function $H_{VA}(s,t)$ such that its marginal distributions are identical to $F_V$ and $F_A$,
respectively, i.e.,

\[
\hat{V} \overset{d}{=} V \quad \text{and} \quad \hat{A} \overset{d}{=} A,
\]

where $\overset{d}{=}$ means “equality-in-distribution”. Thus, existence of a coupling is equivalent to the assumption of “context independence”
mentioned above. Inequality (6) corresponds to an independent coupling of $V$ and $A$ with

\[
H_{VA}(s,t) = F_V(s) \cdot F_A(t),
\]

but there exists an infinite number of possible couplings (see footnote 1). The “trick” is to find a dependence structure that fits one’s purposes.
For the race model Inequality (4), which can be written equivalently as

\[
F_{VA}(t) \le \min\{F_V(t) + F_A(t), 1\}, \quad t \ge 0,
\]

it turns out that the right-hand side corresponds to the coupling of $V$ and $A$ generating maximal negative stochastic dependence
between the two random variables. Moreover, the area $S$ between $F_{VA}$ and $\min\{F_V(t) + F_A(t), 1\}$ equals the expected value of
random variable $\min\{V,A\}$, i.e.,

\[
S = \mathrm{E}^{-}\min\{V,A\},
\]

under maximal negative dependence between $V$ and $A$, with superscript $-$ indicating maximal negative dependence.

1 For a comprehensive treatment of the theory of coupling, see 22.

CRE of RT under maximal negative dependence

A measure of crossmodal response enhancement for reaction times, based on maximal negative dependence, can then be defined
by replacing $\min\{\mathrm{E}RT_V, \mathrm{E}RT_A\}$ in Equation (3) by area $S$, yielding:

\[
\mathrm{CRE}^{-}_{RT} = \frac{\mathrm{E}^{-}\min\{V,A\} - \mathrm{E}RT_{VA}}{\mathrm{E}^{-}\min\{V,A\}} \times 100. \tag{7}
\]

Because $\mathrm{E}^{-}\min\{V,A\} \le \min\{\mathrm{E}RT_V, \mathrm{E}RT_A\}$, it follows that

\[
\mathrm{CRE}^{-}_{RT} \le \mathrm{CRE}_{RT}
\]

always. In other words, the new index of crossmodal response enhancement for RT is more conservative than the traditional one.
Proof of these statements, being analogous to the one given for spike counts in the next section, is omitted here, but see 19,23,24.
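
A sample version of Equation (7) can be sketched as follows. This is an illustration under assumptions: it estimates $\mathrm{E}^{-}\min\{V,A\}$ by pairing the ascending visual RTs with the descending auditory RTs, by analogy with the antithetic pairing the paper describes for spike counts, and it assumes equal numbers of unimodal trials. The function names and example values are hypothetical.

```python
from statistics import mean


def e_min_neg_dependence(rt_v, rt_a):
    """Estimate E^- min{V, A}: pair the i-th smallest V with the i-th largest A
    (antithetic pairing) and average the pairwise minima."""
    if len(rt_v) != len(rt_a):
        raise ValueError("this sketch assumes equal numbers of unimodal trials")
    pairs = zip(sorted(rt_v), sorted(rt_a, reverse=True))
    return mean(min(v, a) for v, a in pairs)


def cre_rt_neg(rt_va, rt_v, rt_a):
    """Sample version of Eq. 7: enhancement relative to the race-model
    benchmark under maximal negative dependence, in percent."""
    bench = e_min_neg_dependence(rt_v, rt_a)
    return (bench - mean(rt_va)) / bench * 100.0


# Hypothetical reaction times in ms (illustrative values only).
rt_v = [190, 200, 210]
rt_a = [195, 205, 215]
rt_va = [175, 180, 185]
print(e_min_neg_dependence(rt_v, rt_a), cre_rt_neg(rt_va, rt_v, rt_a))
```

Because each pairwise minimum is at most the corresponding unimodal value, the benchmark can never exceed the faster unimodal mean, which is why the resulting index is the more conservative one.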


Measuring crossmodal enhancement in single neurons 


To fix ideas, let $N_V$, $N_A$, and $N_{VA}$ denote the random number of impulses (spikes) emitted in a given time interval by a
neuron, following unisensory (visual, auditory) and crossmodal (visual-auditory) stimulation, respectively, without assuming 
any specific parametric distribution for these random variables. Inserting their expected values into the traditional CRE of 
Equation (1) yields 


\[
\mathrm{CRE}_{SP} = \frac{\mathrm{E}N_{VA} - \max\{\mathrm{E}N_V, \mathrm{E}N_A\}}{\max\{\mathrm{E}N_V, \mathrm{E}N_A\}} \times 100, \tag{8}
\]


where subscript SP indicates measurement of spikes. At the level of samples, the expected values are replaced by arithmetic 
averages. 

Realizations of random variables $N_V$ and $N_A$, with distribution functions $G_V$ and $G_A$, respectively, are collected across
experimental trials under different stimulus conditions (unisensory and bisensory). Thus, as observed above for reaction times,
they refer to distinct probability spaces and there is a priori no natural way to combine the results from unisensory visual and
auditory trials. In particular, any assumption about stochastic (in-)dependence between $N_V$ and $N_A$ is void. Nevertheless, one
can define a stochastic coupling of the two random variables. Coupling of $N_V$ and $N_A$ here amounts to defining a distribution
$H_{VA}$ for a bivariate random vector $(\hat{N}_V, \hat{N}_A)$ in such a way that its marginal distributions are identical to $G_V$ and $G_A$.

Let $H_{VA}(m,n) = P(\hat{N}_V \le m, \hat{N}_A \le n)$, $m,n = 0,1,\ldots$, be the distribution for some coupling of $N_V$ and $N_A$. As a bivariate
(discrete) distribution, it obeys the Fréchet inequalities valid for any distribution: 23


\[
\max\{0, G_V(m) + G_A(n) - 1\} \le H_{VA}(m,n) \le \min\{G_V(m), G_A(n)\}, \tag{9}
\]

for all $m,n = 0,1,\ldots$. Setting $m = n$, we get

\[
H_{VA}(m,m) = P(\max\{N_V, N_A\} \le m),
\]

and from (9),

\[
H^{-}(m) = \max\{0, G_V(m) + G_A(m) - 1\} \le H_{VA}(m,m) \le \min\{G_V(m), G_A(m)\} = H^{+}(m), \tag{10}
\]

for $m = 0,1,\ldots$. In (10), both upper bound $H^{+}(m)$ and lower bound $H^{-}(m)$ are univariate distribution functions of random
variable $\max\{N_V, N_A\}$. Moreover, it is well known 26 that $H^{+}$ and $H^{-}$ represent distributions with maximal positive, respectively
negative, dependence between $N_V$ and $N_A$, assuming non-degenerate marginal distributions $G_V$ and $G_A$.


Proposition 1 Under any coupling of the univariate response random variables $N_V$ and $N_A$, the following bounds hold for
expected value $\mathrm{E}\max\{N_V, N_A\}$,

\[
\max\{\mathrm{E}N_V, \mathrm{E}N_A\} \le \mathrm{E}\max\{N_V, N_A\} \le \mathrm{E}^{-}\max\{N_V, N_A\},
\]

where $\mathrm{E}^{-}\max\{N_V, N_A\}$ is the expected value under maximal negative dependence between the univariate response random
variables.


To prove the right-hand bound of the proposition, rewrite Equation (10) as

\[
1 - H^{+}(m) \le 1 - H_{VA}(m,m) = P(\max\{N_V, N_A\} > m) \le 1 - H^{-}(m)
\]

for $m = 0,1,\ldots$. Summing over all $m$ yields the result

\[
\sum_{m=0}^{\infty} [1 - H_{VA}(m,m)] = \mathrm{E}\max\{N_V, N_A\} \le \sum_{m=0}^{\infty} [1 - H^{-}(m)] = \mathrm{E}^{-}\max\{N_V, N_A\}.
\]

The left-hand bound, $\max\{\mathrm{E}N_V, \mathrm{E}N_A\} \le \mathrm{E}\max\{N_V, N_A\}$, follows directly from Jensen’s inequality (see, e.g., 27 p. 51).
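
As an illustrative check of Proposition 1, the two Fréchet bounds of Equation (10) can be turned into expected values by summing the corresponding survival functions. The sketch below assumes spike-count distributions given as finite probability lists over the counts 0, 1, 2, ...; the function names and the example distribution are hypothetical.

```python
from itertools import accumulate


def cdf(pmf):
    """Cumulative distribution G(m), m = 0..len(pmf)-1, from a list of probabilities."""
    return list(accumulate(pmf))


def expected_max_bounds(pmf_v, pmf_a):
    """Return (E^+ max, E^- max): expected value of max{N_V, N_A} under maximal
    positive and maximal negative dependence, obtained by summing
    1 - H^+(m) and 1 - H^-(m) over m (Equation 10)."""
    n = max(len(pmf_v), len(pmf_a))
    # Pad the shorter distribution with zero probabilities.
    g_v = cdf(list(pmf_v) + [0.0] * (n - len(pmf_v)))
    g_a = cdf(list(pmf_a) + [0.0] * (n - len(pmf_a)))
    e_plus = sum(1.0 - min(gv, ga) for gv, ga in zip(g_v, g_a))
    e_minus = sum(1.0 - max(0.0, gv + ga - 1.0) for gv, ga in zip(g_v, g_a))
    return e_plus, e_minus


# Two hypothetical neurons' unisensory counts: 0 or 1 spike with equal probability.
print(expected_max_bounds([0.5, 0.5], [0.5, 0.5]))
```

For this toy case the means are both 0.5, an independent coupling gives an expected maximum of 0.75, and the two bounds bracket that value, in line with the proposition.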

CRE in single neurons under maximal negative dependence

From Proposition 1 it is clear that the sample value of $\mathrm{E}^{-}\max\{N_V, N_A\}$ is the largest mean obtainable from combining the
unisensory responses via probability summation. Replacing $\max\{\mathrm{E}N_V, \mathrm{E}N_A\}$ by $\mathrm{E}^{-}\max\{N_V, N_A\}$ in the traditional CRE index
results in the new index

\[
\mathrm{CRE}^{-}_{SP} = \frac{\mathrm{E}N_{VA} - \mathrm{E}^{-}\max\{N_V, N_A\}}{\mathrm{E}^{-}\max\{N_V, N_A\}} \times 100. \tag{11}
\]

This new index measures the degree by which a neuron’s observed multisensory response surpasses the level obtainable by
optimally combining the unisensory responses (assuming that the neuron simply reacts to the more salient modality in any
given cross-modal trial). The test for multisensory enhancement then amounts to comparing the observed mean number of
impulses to crossmodal stimulation with the estimate for $\mathrm{E}^{-}\max\{N_V, N_A\}$. For empirical data, the expected value $\mathrm{E}N_{VA}$ is
replaced by the sample mean of crossmodal responses and $\mathrm{E}^{-}\max\{N_V, N_A\}$ is estimated using the method of antithetic variates
as demonstrated below (see also 27 ).


Two important consequences 

Applying the new index has two important consequences. First, given that the new index is never larger than
the traditional index,

\[
\mathrm{CRE}^{-}_{SP} \le \mathrm{CRE}_{SP},
\]

some neurons previously labeled “multisensory” may lose that property under the new index. This is illustrated with an
empirical data set following the next section.

Second, from the definition of $\mathrm{CRE}_{SP}$ it follows that changing the variability of the unisensory responses, while leaving
$\max\{\mathrm{E}N_V, \mathrm{E}N_A\}$ invariant, will not affect the value of $\mathrm{CRE}_{SP}$. In contrast, the new index, being based on $\mathrm{E}^{-}\max\{N_V, N_A\}$, can be
sensitive to such changes. This is illustrated here for the case of Poisson distributed spikes.


Example: Poisson-distributed spikes 

Let the spike counts $N_V$ and $N_A$ follow a Poisson distribution, i.e.,

\[
P(N_i = m) = \exp[-\lambda_i]\, \frac{\lambda_i^{m}}{m!} \quad \text{for } m = 0,1,2,\ldots \tag{12}
\]

with $i = V$ or $i = A$. For this distribution, $\mathrm{E}N_i = \lambda_i$ and, for the variance, $\mathrm{Var}N_i = \lambda_i$ as well. The traditional index can thus be
written as 


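\[
\mathrm{CRE}_{SP} = \frac{\mathrm{E}N_{VA} - \max\{\mathrm{Var}N_V, \mathrm{Var}N_A\}}{\max\{\mathrm{Var}N_V, \mathrm{Var}N_A\}} \times 100. \tag{13}
\]

We assume, without loss of generality, that $\mathrm{Var}N_A \le \mathrm{Var}N_V$. Obviously, increasing $\mathrm{Var}N_A$ will not change the value of $\mathrm{CRE}_{SP}$
as long as $\mathrm{Var}N_A$ is not strictly larger than $\mathrm{Var}N_V$. In contrast, as will now be shown, $\mathrm{E}^{-}\max\{N_V, N_A\}$, and therefore $\mathrm{CRE}^{-}_{SP}$ as
well, will not remain invariant with $\mathrm{Var}N_A$ increasing.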

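Inserting into the expected value yields

\[
\mathrm{E}^{-}\max\{N_V, N_A\} = \sum_{m=0}^{\infty} [1 - H^{-}(m)] = \sum_{m=0}^{\infty} \left[ 1 - \max\left\{ 0, \sum_{k=0}^{m} P(N_V = k) + \sum_{k=0}^{m} P(N_A = k) - 1 \right\} \right].
\]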


For given values of parameters $\lambda_V$ and $\lambda_A$, approximate computation of this expected value is simplified by using the fact 28 that
the (cumulative) distribution for the Poisson is expressed in terms of the incomplete gamma function. Specifically, for $i = V, A$:

\[
\sum_{k=0}^{m} P(N_i = k) = \Gamma(m+1, \lambda_i)/\Gamma(m+1).
\]

Here, the ratio $\Gamma(m+1, \lambda_i)/\Gamma(m+1)$ is the regularized incomplete gamma function, with $\Gamma(m) = (m-1)!$ and $\Gamma(m, \lambda_i)$ the
(upper) incomplete gamma function

\[
\Gamma(m, \lambda_i) = \int_{\lambda_i}^{\infty} e^{-t}\, t^{m-1}\, dt. \tag{14}
\]


For illustration of the effect, we choose specific, but otherwise arbitrary, parameter values: $\mathrm{E}N_{VA} = 30$ and, for $\mathrm{Var}N_V = \lambda_V = 22$
and $\mathrm{Var}N_V = \lambda_V = 26$, we varied $\mathrm{Var}N_A = \lambda_A$ between 5 and 22 and 26, respectively. Table 1 lists the corresponding values
of $\mathrm{CRE}^{-}_{SP}$ as a function of $\mathrm{Var}N_V$ and $\mathrm{Var}N_A$ as well as the $\mathrm{CRE}_{SP}$ for the two different values of $\mathrm{Var}N_V$. Notably, increasing
$\mathrm{Var}N_A = \lambda_A$ corresponds to a strong decrease in $\mathrm{CRE}^{-}_{SP}$, whereas $\mathrm{CRE}_{SP}$ remains invariant against such increase in variability
of $N_A$.

λ_V     λ_A     CRE^-_SP    CRE_SP
22      5       36.3        36.4
22      10      35.1
22      16      29.0
22      22      16.6
26      5       15.4        15.4
26      10      15.0
26      16      12.7
26      22      6.3
26      26      -0.2

Table 1. Poisson-distributed spike counts. Values of $\mathrm{CRE}^{-}_{SP}$ are shown as a function of $\lambda_A = \mathrm{Var}N_A$ and two fixed values of
$\lambda_V = \mathrm{Var}N_V$. $\mathrm{CRE}^{-}_{SP}$ decreases with increasing variability of $N_A$, whereas $\mathrm{CRE}_{SP}$ remains constant.
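
The computation behind Table 1 can be sketched as follows. This is a minimal illustration under assumptions rather than the authors' code: the Poisson distribution function is accumulated directly from the probabilities of Equation (12) instead of through the incomplete gamma function, the infinite sum is truncated at a hypothetical cut-off `m_max`, and all function names are illustrative.

```python
import math


def poisson_cdf_values(lam, m_max):
    """Return [P(N <= m) for m = 0..m_max] for N ~ Poisson(lam), accumulating
    the probabilities recursively via p_m = p_{m-1} * lam / m."""
    p = math.exp(-lam)
    total = p
    values = [total]
    for m in range(1, m_max + 1):
        p *= lam / m
        total += p
        values.append(min(total, 1.0))  # guard against rounding above 1
    return values


def e_max_neg_poisson(lam_v, lam_a, m_max=200):
    """E^- max{N_V, N_A} = sum over m of 1 - max{0, G_V(m) + G_A(m) - 1},
    truncated at m_max (an assumption; adequate for moderate rates)."""
    g_v = poisson_cdf_values(lam_v, m_max)
    g_a = poisson_cdf_values(lam_a, m_max)
    return sum(1.0 - max(0.0, gv + ga - 1.0) for gv, ga in zip(g_v, g_a))


def cre_pair(e_n_va, lam_v, lam_a):
    """Return (traditional CRE_SP, new CRE^-_SP) in percent."""
    sm_max = max(lam_v, lam_a)
    trad = (e_n_va - sm_max) / sm_max * 100.0
    bench = e_max_neg_poisson(lam_v, lam_a)
    return trad, (e_n_va - bench) / bench * 100.0


for lam_a in (5, 10, 16, 22):
    trad, new = cre_pair(30.0, 22.0, lam_a)
    print(f"lambda_A={lam_a:2d}  CRE_SP={trad:5.1f}  CRE-_SP={new:5.1f}")
```

The traditional index depends on the rates only through the larger mean, so it stays fixed while the auditory rate varies below the visual one; the benchmark of the new index grows with the auditory rate, which lowers the new index.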


Empirical data 

First, we illustrate the computation of $\mathrm{CRE}^{-}_{SP}$ and $\mathrm{CRE}_{SP}$ for a single data set, recordings from a cat superior colliculus (SC)
neuron, followed by a comparison of both indexes on a larger number of such neurons. All data were obtained from the lab
of Mark T. Wallace. 29

Computing $\mathrm{CRE}_{SP}$ and $\mathrm{CRE}^{-}_{SP}$ for data from a single neuron

The data set consists of the total number of spikes, recorded within a response window, that occurred from visual, auditory, and 
visual-auditory stimulation in N = 20 trials, respectively (details in Table 2). Spike numbers in the left-hand columns of Table 2 
include spontaneous activity (S.A.), whereas the right-hand columns show the same recordings after S.A. was removed. 

Note that a priori there is no fixed correspondence between trial number and the individual values of V and A. The antithetic
variates method involves pairing the unisensory responses, sorted by increasing order (V) and by decreasing order (A), and
computing max(V, A) for each pair. Their mean value represents an estimate of $\mathrm{E}^{-}\max\{N_V, N_A\}$, that is, of the maximum
expected value from combining the unisensory responses achievable via negatively dependent probability summation. The trial
numbering of the VA values remains arbitrary.
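
The antithetic pairing just described can be sketched in a few lines. The spike counts below are the left-hand columns of Table 2 (spontaneous activity retained); the function name `e_max_antithetic` and the variable names are illustrative assumptions, and the sketch assumes equal numbers of visual and auditory trials.

```python
from statistics import mean

# Spike counts with spontaneous activity retained (left-hand columns of Table 2).
v = [3, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 10, 11, 11, 13, 14]
a = [8, 8, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5, 5, 4, 4, 4, 4, 4]
va = [11, 22, 17, 19, 18, 13, 18, 11, 26, 20,
      28, 19, 25, 15, 17, 19, 19, 18, 31, 17]


def e_max_antithetic(x, y):
    """Antithetic-variates estimate of E^- max: pair the ascending values of x
    with the descending values of y and average the pairwise maxima."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal numbers of unisensory trials")
    return mean(max(p, q) for p, q in zip(sorted(x), sorted(y, reverse=True)))


bench = e_max_antithetic(v, a)        # benchmark for the new index
sm_max = max(mean(v), mean(a))        # benchmark for the traditional index
cre_trad = (mean(va) - sm_max) / sm_max * 100.0
cre_new = (mean(va) - bench) / bench * 100.0
print(round(bench, 2), round(cre_trad, 2), round(cre_new, 2))
```

Sorting inside the function makes the result independent of the order in which the unisensory trials were recorded, reflecting the absence of any fixed trial correspondence.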

Computing the traditional $\mathrm{CRE}_{SP}$ value by inserting the estimates from Table 2 in Equation (8), i.e., replacing the expected
values by the means, yields

\[
\mathrm{CRE}_{SP} = \frac{\mathrm{E}N_{VA} - \max\{\mathrm{E}N_V, \mathrm{E}N_A\}}{\max\{\mathrm{E}N_V, \mathrm{E}N_A\}} \times 100 = \frac{19.15 - \max\{8.05, 5.75\}}{\max\{8.05, 5.75\}} \times 100 = 137.89\,[\%]
\]

for spike numbers containing S.A. (left-hand columns). The corresponding value for the new index is estimated by inserting the
estimates from Table 2 in Equation (11),

\[
\mathrm{CRE}^{-}_{SP} = \frac{\mathrm{E}N_{VA} - \mathrm{E}^{-}\max\{N_V, N_A\}}{\mathrm{E}^{-}\max\{N_V, N_A\}} \times 100 = \frac{19.15 - 8.85}{8.85} \times 100 = 116.38\,[\%].
\]

The corresponding values for responses with S.A. removed (right-hand columns) amount to

\[
\mathrm{CRE}_{SP} = \frac{16.083 - \max\{6.163, 5.243\}}{\max\{6.163, 5.243\}} \times 100 = 160.96\,[\%]
\]

and

\[
\mathrm{CRE}^{-}_{SP} = \frac{16.083 - 7.484}{7.484} \times 100 = 114.90\,[\%].
\]

The results are quite clear-cut. For this neuron, replacing $\mathrm{CRE}_{SP}$ by $\mathrm{CRE}^{-}_{SP}$ corresponds to a drop from about 161% to about
115% with spontaneous activity removed, and from about 138% to about 116% when spontaneous activity was retained. Thus,
applying the new index may well lead to dropping the “multisensory” label for this neuron depending, of course, on one’s
criterion for attaching that label.

               Spike numbers                     Spike numbers w/o S.A.
trial          V      A      max(V, A)   VA      V        A       max(V, A)   VA
1              3      8      8           11      1.113    7.493   7.493       18.933
2              4      8      8           22      2.113    7.493   7.493       13.933
3              5      7      7           17      3.113    6.493   6.493       15.933
4              5      7      7           19      3.113    6.493   6.493       14.933
5              5      7      7           18      3.113    6.493   6.493       9.933
6              6      7      7           13      4.113    6.493   6.493       14.933
7              6      6      6           18      4.113    5.493   5.493       7.933
8              7      6      7           11      5.113    5.493   5.493       22.933
9              7      6      7           26      5.113    5.493   5.493       16.933
10             8      6      8           20      6.113    5.493   6.113       24.933
11             8      6      8           28      6.113    5.493   6.113       15.933
12             9      6      9           19      7.113    5.493   7.113       21.933
13             9      5      9           25      7.113    4.493   7.113       11.933
14             10     5      10          15      8.113    4.493   8.113       13.933
15             10     5      10          17      8.113    4.493   8.113       15.933
16             10     4      10          19      8.113    3.493   8.113       15.933
17             11     4      11          19      9.113    3.493   9.113       14.933
18             11     4      11          18      9.113    3.493   9.113       27.933
19             13     4      13          31      11.113   3.493   11.113      13.933
20             14     4      14          17      12.113   3.493   12.113      7.933
mean           8.05   5.75   8.85        19.15   6.163    5.243   7.484       16.083
standard dev.  2.999  1.333  2.159       5.204   2.999    1.333   1.791       5.204


Table 2. Sample of recordings from a single cat SC neuron. Columns 2 and 6 (V) are arranged by increasing order, 3 and 7 (A) by
decreasing order. S.A. stands for “spontaneous activity” (4.26 spikes/s in this sample). Standard PSTHs were computed.
Spontaneous activity was computed from the 500 ms preceding each stimulus onset (allowing at least 1500 ms between
trials). A threshold of mean S.A. rate per 10 ms bin plus 2 standard deviations was computed and used only to determine onset and
offset. Response onset was defined as the first spike within the bin in which the response rose above this threshold and remained
above it for at least 3 bins. Offset was counted as the last spike in the bin just before the response fell back below this threshold
and remained below it for 3 bins. The response window (duration) is the time between onset and offset. The total number of spikes
(left columns in the table) includes all spikes within the response window, which will inevitably include some S.A. The right
columns include responses with S.A. removed: the expected number of S.A. spikes within the given window (i.e., S.A. rate times
window size in seconds) was subtracted. This is never an integer and can sometimes cause negative values on some trials. This
number represents “change from baseline firing” (information obtained from M. T. Wallace, personal communication, July 18,
2015).

Comparing $\mathrm{CRE}_{SP}$ and $\mathrm{CRE}^{-}_{SP}$ for n = 27 recording blocks

The total data set comprised 84 recording blocks of length 15 each from 20 SC cells, where the number of spikes to visual-auditory
stimulation was found significantly larger than the maximum of responses to unisensory stimulation, according to
the categorization from the Wallace lab. In 57 of these blocks, there was no response at all from either visual or auditory
stimulation. For those cases, $\mathrm{CRE}^{-}_{SP} = \mathrm{CRE}_{SP}$ by definition, so comparison is void. The data from the remaining 27 recording
blocks were available for comparing both indexes.

In order to obtain confidence interval estimates for the difference between $\mathrm{CRE}^{-}_{SP}$ and $\mathrm{CRE}_{SP}$, each of these 27 blocks
underwent a bootstrap procedure, i.e., 10,000 random samples of N = 15 were taken with replacement from the sets of spike
frequencies for visual (V), auditory (A), and bimodal (VA) stimulation. For each sample, both $\mathrm{CRE}^{-}_{SP}$ and $\mathrm{CRE}_{SP}$ were
computed, yielding a 95% confidence interval for their difference in each of the 27 recording blocks. The points of Figure 1
depict pairs of bootstrap estimates of ($\mathrm{CRE}_{SP}$, $\mathrm{CRE}^{-}_{SP}$). In the left panel (with spontaneous activity retained), there were 4 out
of 27 cases with no significant difference between both measures (red color); after spontaneous activity was removed, only 1
out of 19 cases was not significant (see right panel). In the latter, the number of possible comparisons decreased to 19 because
in the other blocks there was no activity left for one of the unisensory conditions. In summary, this arguably limited data set
supports the observation that many neurons previously labeled “multisensory” will no longer be categorized as such.
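
A bootstrap of this kind can be sketched as follows. The source does not state how the confidence interval was derived from the resamples, so the percentile method used here is an assumption, as are the function names, the fixed random seed, and the rule of skipping resamples without any unisensory response.

```python
import random


def _mean(xs):
    return sum(xs) / len(xs)


def indices(v, a, va):
    """Return (traditional CRE_SP, new CRE^-_SP) in percent for one sample.
    Assumes len(v) == len(a) for the antithetic pairing."""
    sm_max = max(_mean(v), _mean(a))
    paired = zip(sorted(v), sorted(a, reverse=True))
    bench = _mean([max(p, q) for p, q in paired])
    m_va = _mean(va)
    return (m_va - sm_max) / sm_max * 100.0, (m_va - bench) / bench * 100.0


def bootstrap_diff_ci(v, a, va, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for CRE_SP - CRE^-_SP. Each condition is
    resampled with replacement, keeping its original sample size."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        bv = rng.choices(v, k=len(v))
        ba = rng.choices(a, k=len(a))
        bva = rng.choices(va, k=len(va))
        if max(_mean(bv), _mean(ba)) == 0:
            continue  # indices are undefined without any unisensory response
        trad, new = indices(bv, ba, bva)
        diffs.append(trad - new)
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * len(diffs))]
    upper = diffs[min(len(diffs) - 1, int((1 + level) / 2 * len(diffs)))]
    return lower, upper
```

Under this sketch, a block would count as showing a significant difference between the two indices when the lower end of the interval lies above zero.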

Figure 1. Bootstrapped values (logarithmic scale) of $\mathrm{CRE}^{-}_{SP}$ vs. $\mathrm{CRE}_{SP}$ based on 10,000 samples (in right-hand panel,
spontaneous activity was removed). Except for 4 out of 27 cases (left panel) and 1 out of 19 cases (right panel), shown as red points, the new index
was significantly smaller than the traditional one (95% confidence intervals, too small to be shown on the graphs).


Discussion and Conclusion 

The issue of how to quantify crossmodal response enhancement due to the occurrence of multisensory integration has been under 
discussion in both behavioral and neurophysiological research. The most widely used index up to now expresses crossmodal 
enhancement as a proportion of the strongest unisensory response. It has descriptive value but lacks a theoretical basis. Such a 
foundation is essential because, as widely acknowledged in both reaction time and neural studies, being responsive to multiple 
sensory modalities does not guarantee that the response has been generated by actually integrating the multiple sensory inputs, 
rather than simply responding to the most salient stimulus modality. Here we suggest a new index that measures by how much 
the crossmodal response surpasses the level obtainable by optimally combining the unisensory responses. Optimality is achieved 
by using a probability summation mechanism that combines the unisensory responses with maximal negative dependence. 
Importantly, no claim is made that the system actually operates under this mechanism; it only serves as a well-defined benchmark
against which to gauge the crossmodal response.

It has been demonstrated here that the new index can be defined in a consistent manner for both studying reaction times and 
responses by single neurons (spike frequencies). Whereas the index is closely linked to the race model inequality, a widely used 
testing procedure for multisensory integration in reaction times, its application to neural responses has new and potentially 
important consequences: neurons previously labeled as “multisensory” may lose that property since the new index tends to 
yield smaller values for the amount of crossmodal enhancement. This was exemplified here with a data set collected from single 
SC neurons. The extent to which this holds more generally can only be determined by a large-scale investigation of a multitude 
of neurons from empirical studies. Obviously, at the level of a (sub-)population of neurons, such a relabeling may lead to a 
reassessment of the distribution of multisensory neurons and different types of unisensory neurons for that region. Moreover, 
studies probing the entire scope of the behavior of multisensory neurons, e.g. by looking at intrinsic differences in the dynamic 
range of these neurons (see 5 ), may come to different conclusions when using the new index.

We also showed that the new index, $\mathrm{CRE}^{-}_{SP}$, is easy to compute and does not require any specific assumption about the
distribution of spikes. The special case of Poisson-distributed spikes was drawn upon to demonstrate that the new index is 
sensitive to the variability in the data, in contrast to the traditional index which by definition only depends on the means of the 
uni- and crossmodal response distributions. 

It is worth mentioning that the new approach can also be applied to an alternative measure, comparing cross-modal responses 
to the sum of the unisensory responses (“additive model”) (see also 30 ). From 31 (and more recent papers in actuarial statistics), 
it is possible to compute the maximally achievable sum of two random variables and, using the same logic as for computing 
$\mathrm{CRE}^{-}_{SP}$, cross-modal responses can be compared with the response level obtainable by adding the unisensory responses in an
optimal way. 

Future research should address a number of issues. For example, is the new index consistent with the “inverse effectiveness
rule” of multisensory integration, stating that crossmodal response enhancement decreases with the intensity of the stimuli
applied? Preliminary reasoning suggests that both $\mathrm{CRE}_{SP}$ and $\mathrm{CRE}^{-}_{SP}$ are consistent with this “rule”. Only the latter, however,
seems also sensitive to an increase in the intensity of the modality to which the neuron is less responsive.

Another issue is whether the logic of the new index can be extended to more than two modalities. Such a generalization is
not straightforward given that maximal negative dependence among three random variables is strongly limited. On a broader 
level, it would be interesting to explore whether the new index, or at least its logic, could be utilized beyond the level of single 
neuron responses, possibly including data from functional magnetic resonance studies. 32 As the authors of a recent review 33 
put it, “..., an enhanced BOLD response for multisensory relative to unisensory stimulation can be due to “true” multisensory
neurons integrating stimulation from two or more sensory modalities, but it can just as well be explained by driving two 
unisensory sub-populations instead of one. If the latter scenario would be true, one might wrongly infer multisensory integration 
at the neuronal level.” 

Given the recent results by Miller et al., 10 showing “...that the integration of temporally displaced sensory responses is also 
highly dependent on the relative efficacies with which they drive their common target neuron”, one may also more generally 
question the usefulness of any static measure of crossmodal enhancement, and this may lead to implementing a temporal 
dimension to a quantitative index of crossmodal enhancement. 